    Parameters of Cross-linguistic Variation in Expectation-based Minimalist Grammars (e-MGs)

    That Parsing and Generation share the same grammatical knowledge is often considered the null hypothesis (Momma and Phillips 2018), but very few algorithms can take advantage of a cognitively plausible incremental procedure that operates roughly in the way words are produced and understood in real time. This is especially difficult if we consider cross-linguistic variation that has a clear impact on word order. In this paper, I present one such formalism, dubbed Expectation-based Minimalist Grammar (e-MG), that qualifies as a simplified version of the (Conflated) Minimalist Grammars, (C)MGs (Stabler 1997, 2011, 2013), and Phase-based Minimalist Grammars, PMGs (Chesi 2005, 2007; Stabler 2011). The crucial simplification consists of driving structure building using only lexically encoded categorial top-down expectations. The commitment to the top-down procedure (in e-MGs and PMGs, as opposed to (C)MGs) will be crucial to capture a relevant set of empirical asymmetries in a parameterized cross-linguistic perspective, which represents the least common denominator of structure building in both Parsing and Generation.

    An efficient Trie for binding (and movement)

    Non-local dependencies connecting arbitrarily distant structural chunks are often modeled using (LIFO) memory buffers (see Chesi 2012 for a review). Other solutions (e.g. slash features in HPSG, Pollard & Sag 1994) are not directly usable in both parsing and generation algorithms without undermining an incremental left-right processing assumption. Memory buffers are, however, empirically limited and psycholinguistically invalid (Nairne 2002). Here I propose to adopt Trie memories instead of stacks to encode the features relevant for establishing non-local dependencies. This leads to simpler and more transparent solutions both for wh- argumental configurations and for anaphoric pronominal coreference.

    Asymmetries in extraction from nominal copular sentences: A challenging case study for NLP tools

    In this paper we discuss two types of nominal copular sentences (Canonical and Inverse, Moro 1997) and we demonstrate how the peculiarities of these two configurations are hardly considered by standard NLP tools that are currently publicly available. Here we show that example-based MT tools (e.g. Google Translate) as well as other NLP tools (UDpipe, LinguA, Stanford Parser, and Google Cloud AI API) fail to capture the critical distinctions between the two structures, producing wrong analyses and, in the case of MT tools, incorrect translations, possibly as a consequence of a non-coherent (or missing) structural analysis. To support the proposed analysis, we also present an empirical study showing that native speakers are indeed sensitive to the critical distinctions. This poses a sharp challenge for NLP tools that aim at being cognitively plausible or at least descriptively adequate (Chowdhury & Zamparelli 2018).

    The Role of D-Linking and Lexical Restriction in Locality Violations

    The major contrast discussed in the literature to show an obviation of the wh-island effect often involves a bare wh- element in the role of the intervener (e.g. who) and a “complex” wh-phrase (e.g. which book) in the role of the moved item. This contrast is not minimal, since it is not sufficient to disentangle the role of D-linking (Pesetsky 1987) from that of the so-called “lexical restriction” (Friedmann, Belletti, and Rizzi 2009). In this work we try to fill this gap by contrasting, in an argumental wh-island configuration (e.g. “… [who read …]”), which NP vs. what NP both in English and in Italian (e.g. which/what book and quale/che libro). We argue that while both wh-phrases can be genuinely considered “lexically restricted”, the first, and not the second, has properties that make it allegedly D-linked (i.e. a canonical partitive interpretation is available). Our acceptability studies show that, in both languages, no significant difference is revealed in the scores attributed to the two extracted wh-phrases and no significant variance (e.g. indicating a binomial distribution) is observed in the condition what NP. The first result indicates that the “D-linking” hypothesis as an independent source of amelioration is inadequate; the second result suggests that the hypothesis that the condition what/che NP might be ambiguous between a D-linked and a non-D-linked reading is also unlikely.

    The first Mirandese text-to-speech system 

    This paper describes the creation of base NLP resources and tools for Mirandese, an under-resourced minority language spoken in Portugal, in the context of building a text-to-speech system, a collaborative citizenship project between Microsoft, ILTEC, and ALM – Associaçon de la Lhéngua Mirandesa. Development efforts encompassed the compilation of a large textual corpus, definition of a complete phone-set, development of a tokenizer, inflector, TN and GTP modules, and creation of a large phonetic lexicon with syllable segmentation, stress mark-up, and POS. The TTS system will provide an open access web interface freely available to the community, along with the other resources. We took advantage of mature tools, resources, and processes already available for phylogenetically close languages, allowing us to cut development time and resources to a great extent, a solution that can be viable for other lesser-spoken languages in a similar situation. National Foreign Language Resource Cente

    Proceedings of the Fifth Italian Conference on Computational Linguistics CLiC-it 2018

    On behalf of the Program Committee, a very warm welcome to the Fifth Italian Conference on Computational Linguistics (CLiC-it 2018). This edition of the conference is held in Torino. The conference is locally organised by the University of Torino and hosted in its prestigious main lecture hall “Cavallerizza Reale”. The CLiC-it conference series is an initiative of the Italian Association for Computational Linguistics (AILC) which, after five years of activity, has clearly established itself as the premier national forum for research and development in the fields of Computational Linguistics and Natural Language Processing, where leading researchers and practitioners from academia and industry meet to share their research results, experiences, and challenges.

    EVALITA Evaluation of NLP and Speech Tools for Italian - December 17th, 2020

    Welcome to EVALITA 2020! EVALITA is the evaluation campaign of Natural Language Processing and Speech Tools for Italian. EVALITA is an initiative of the Italian Association for Computational Linguistics (AILC, http://www.ai-lc.it) and is endorsed by the Italian Association for Artificial Intelligence (AIxIA, http://www.aixia.it) and the Italian Association for Speech Sciences (AISV, http://www.aisv.it).